Overview

Dataset statistics

Number of variables: 23
Number of observations: 87813
Missing cells: 90411
Missing cells (%): 4.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 15.4 MiB
Average record size in memory: 184.0 B
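
The two derived figures in this block can be recomputed from the raw counts. The sketch below is a minimal check, assuming the missing-cell percentage is taken over all cells (variables times observations) and that 1 MiB is 1,048,576 bytes; the variable names are illustrative and do not come from the report.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 23
n_observations = 87813
missing_cells = 90411
total_size_mib = 15.4  # already rounded to one decimal in the report

# Assumed denominator: every cell in the table (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumed unit: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Because the total size is itself rounded, the recomputed record size lands slightly below the reported 184.0 B.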

Variable types

Numeric: 8
Categorical: 15

Warnings

tipo_persona has constant value "Natural" [Constant]
sueldo_smdlv is highly correlated with otros_ingresos_smdlv and 1 other field [High correlation]
otros_ingresos_smdlv is highly correlated with sueldo_smdlv [High correlation]
año_credito is highly correlated with sueldo_smdlv [High correlation]
valor_credito_smdlv is highly correlated with cuotas [High correlation]
cuotas is highly correlated with valor_credito_smdlv [High correlation]
municipio_residencia is highly correlated with municipio_credito and 2 other fields [High correlation]
periodo_credito is highly correlated with Row and 1 other field [High correlation]
sueldo_smdlv is highly correlated with otros_ingresos_smdlv [High correlation]
municipio_credito is highly correlated with municipio_residencia and 2 other fields [High correlation]
municipio_expedicion is highly correlated with municipio_residencia and 3 other fields [High correlation]
cuotas is highly correlated with forma_pago [High correlation]
codeudor is highly correlated with Row [High correlation]
genero is highly correlated with Row [High correlation]
forma_pago is highly correlated with cuotas and 3 other fields [High correlation]
año_credito is highly correlated with forma_pago [High correlation]
Row is highly correlated with periodo_credito and 5 other fields [High correlation]
otros_ingresos_smdlv is highly correlated with sueldo_smdlv [High correlation]
municipio_nacimiento is highly correlated with municipio_residencia and 3 other fields [High correlation]
estado_final is highly correlated with periodo_credito and 1 other field [High correlation]
municipio_expedicion is highly correlated with tipo_persona and 1 other field [High correlation]
sector is highly correlated with tipo_persona and 1 other field [High correlation]
tipo_persona is highly correlated with municipio_expedicion and 13 other fields [High correlation]
procedencia is highly correlated with tipo_persona [High correlation]
municipio_credito is highly correlated with tipo_persona and 1 other field [High correlation]
tipo_venta is highly correlated with tipo_persona [High correlation]
municipio_residencia is highly correlated with tipo_persona and 1 other field [High correlation]
genero is highly correlated with tipo_persona [High correlation]
forma_pago is highly correlated with sector and 2 other fields [High correlation]
periodo_credito is highly correlated with tipo_persona [High correlation]
estado_final is highly correlated with tipo_persona and 1 other field [High correlation]
tiene_casa_propia is highly correlated with tipo_persona [High correlation]
municipio_nacimiento is highly correlated with municipio_expedicion and 1 other field [High correlation]
estado_civil is highly correlated with tipo_persona [High correlation]
codeudor is highly correlated with tipo_persona [High correlation]
sueldo_smdlv has 8704 (9.9%) missing values [Missing]
otros_ingresos_smdlv has 81706 (93.0%) missing values [Missing]
Row has unique values [Unique]
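
The percentages in the two missing-value warnings follow from the counts and the 87813 observations. A minimal sketch of that arithmetic is below; the 50% cut-off is a hypothetical triage threshold chosen for illustration and is not something the report states.

```python
# Counts copied from the warnings in this report.
n_observations = 87813
missing_counts = {"sueldo_smdlv": 8704, "otros_ingresos_smdlv": 81706}

# Percentage of rows with a missing value, rounded as in the report.
missing_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing_counts.items()
}

# Hypothetical cut-off for flagging mostly-empty columns (an assumption).
HIGH_MISSING_THRESHOLD = 50.0
mostly_missing = [
    name for name, pct in missing_pct.items() if pct > HIGH_MISSING_THRESHOLD
]

print(missing_pct)
print(mostly_missing)
```

Under that assumed threshold only otros_ingresos_smdlv is flagged, which matches the intuition that a 93.0% missing column carries little usable signal.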

Reproduction

Analysis started: 2021-05-12 19:50:48.491264
Analysis finished: 2021-05-12 19:51:26.213323
Duration: 37.72 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
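
The reported duration is the difference between the two timestamps. A short sketch of that check, assuming both timestamps share the same time zone:

```python
from datetime import datetime

# Timestamp layout used in the report (microsecond precision).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2021-05-12 19:50:48.491264", FMT)
finished = datetime.strptime("2021-05-12 19:51:26.213323", FMT)

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"Duration: {duration_seconds:.2f} seconds")
```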

Variables

Row
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 87813
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 68957.67679
Minimum: 11674
Maximum: 126896
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 686.2 KiB

Quantile statistics

Minimum: 11674
5-th percentile: 18527.6
Q1: 40765
Median: 68886
Q3: 97273
95-th percentile: 120496.4
Maximum: 126896
Range: 115222
Interquartile range (IQR): 56508

Descriptive statistics

Standard deviation: 32475.02055
Coefficient of variation (CV): 0.4709413377
Kurtosis: -1.166914784
Mean: 68957.67679
Median Absolute Deviation (MAD): 28244
Skewness: 0.0278328322
Sum: 6055380472
Variance: 1054626960
Monotonicity: Not monotonic
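
Several of the Row statistics are simple functions of the others. The sketch below recomputes them from the reported figures using the standard definitions (range as maximum minus minimum, IQR as Q3 minus Q1, CV as standard deviation over mean, variance as the squared standard deviation); small differences are expected because the inputs are already rounded.

```python
# Figures copied from the Row statistics in this report.
n = 87813
minimum, maximum = 11674, 126896
q1, q3 = 40765, 97273
mean = 68957.67679
std = 32475.02055

value_range = maximum - minimum  # 115222
iqr = q3 - q1                    # 56508
cv = std / mean                  # coefficient of variation
variance = std ** 2              # squared standard deviation
total = mean * n                 # sum of all values

print(value_range, iqr)
print(f"CV={cv:.10f} variance={variance:.0f} sum={total:.0f}")
```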
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
20470                 1      < 0.1%
105254                1      < 0.1%
17181                 1      < 0.1%
23326                 1      < 0.1%
21279                 1      < 0.1%
109344                1      < 0.1%
113442                1      < 0.1%
111395                1      < 0.1%
101156                1      < 0.1%
99109                 1      < 0.1%
Other values (87803)  87803  > 99.9%

Value   Count  Frequency (%)
11674   1      < 0.1%
11675   1      < 0.1%
11677   1      < 0.1%
11678   1      < 0.1%
11680   1      < 0.1%
11682   1      < 0.1%
11684   1      < 0.1%
11686   1      < 0.1%
11690   1      < 0.1%
11701   1      < 0.1%

Value   Count  Frequency (%)
126896  1      < 0.1%
126895  1      < 0.1%
126894  1      < 0.1%
126893  1      < 0.1%
126887  1      < 0.1%
126883  1      < 0.1%
126882  1      < 0.1%
126880  1      < 0.1%
126878  1      < 0.1%
126877  1      < 0.1%

procedencia
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 686.2 KiB

Nacional: 87594
Extranjero: 219

Length

Max length: 10
Median length: 8
Mean length: 8.004987872
Min length: 8

Characters and Unicode

Total characters: 702942
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
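
As an illustration of how such character properties can be tallied, the sketch below uses Python's standard `unicodedata` module to count general categories for procedencia, weighting each value by its reported count. It is not the report's own implementation; scripts and blocks are left out because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

# Value counts copied from the procedencia summary in this report.
value_counts = {"Nacional": 87594, "Extranjero": 219}

# Tally Unicode general categories ("Lu" = uppercase letter,
# "Ll" = lowercase letter), weighted by how often each value occurs.
category_counts = Counter()
for value, count in value_counts.items():
    for char in value:
        category_counts[unicodedata.category(char)] += count

total_characters = sum(category_counts.values())
distinct_characters = len(set("".join(value_counts)))

print(dict(category_counts))
print(total_characters, distinct_characters)
```

The totals agree with the figures reported for this variable: 702942 characters, 13 distinct characters, 615129 lowercase letters and 87813 uppercase letters.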

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Nacional
2nd row: Extranjero
3rd row: Nacional
4th row: Nacional
5th row: Nacional

Common Values

Value       Count  Frequency (%)
Nacional    87594  99.8%
Extranjero  219    0.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
nacional87594
99.8%
extranjero219
 
0.2%

Most occurring characters

ValueCountFrequency (%)
a175407
25.0%
o87813
12.5%
n87813
12.5%
N87594
12.5%
c87594
12.5%
i87594
12.5%
l87594
12.5%
r438
 
0.1%
E219
 
< 0.1%
x219
 
< 0.1%
Other values (3)657
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter615129
87.5%
Uppercase Letter87813
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a175407
28.5%
o87813
14.3%
n87813
14.3%
c87594
14.2%
i87594
14.2%
l87594
14.2%
r438
 
0.1%
x219
 
< 0.1%
t219
 
< 0.1%
j219
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N87594
99.8%
E219
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin702942
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a175407
25.0%
o87813
12.5%
n87813
12.5%
N87594
12.5%
c87594
12.5%
i87594
12.5%
l87594
12.5%
r438
 
0.1%
E219
 
< 0.1%
x219
 
< 0.1%
Other values (3)657
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII702942
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a175407
25.0%
o87813
12.5%
n87813
12.5%
N87594
12.5%
c87594
12.5%
i87594
12.5%
l87594
12.5%
r438
 
0.1%
E219
 
< 0.1%
x219
 
< 0.1%
Other values (3)657
 
0.1%

genero
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 686.2 KiB

Femenino: 47760
Masculino: 40053

Length

Max length: 9
Median length: 8
Mean length: 8.456116976
Min length: 8

Characters and Unicode

Total characters: 742557
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Femenino
2nd row: Femenino
3rd row: Femenino
4th row: Femenino
5th row: Femenino

Common Values

Value      Count  Frequency (%)
Femenino   47760  54.4%
Masculino  40053  45.6%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
femenino47760
54.4%
masculino40053
45.6%

Most occurring characters

ValueCountFrequency (%)
n135573
18.3%
e95520
12.9%
i87813
11.8%
o87813
11.8%
F47760
 
6.4%
m47760
 
6.4%
M40053
 
5.4%
a40053
 
5.4%
s40053
 
5.4%
c40053
 
5.4%
Other values (2)80106
10.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter654744
88.2%
Uppercase Letter87813
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n135573
20.7%
e95520
14.6%
i87813
13.4%
o87813
13.4%
m47760
 
7.3%
a40053
 
6.1%
s40053
 
6.1%
c40053
 
6.1%
u40053
 
6.1%
l40053
 
6.1%
Uppercase Letter
ValueCountFrequency (%)
F47760
54.4%
M40053
45.6%

Most occurring scripts

ValueCountFrequency (%)
Latin742557
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n135573
18.3%
e95520
12.9%
i87813
11.8%
o87813
11.8%
F47760
 
6.4%
m47760
 
6.4%
M40053
 
5.4%
a40053
 
5.4%
s40053
 
5.4%
c40053
 
5.4%
Other values (2)80106
10.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII742557
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n135573
18.3%
e95520
12.9%
i87813
11.8%
o87813
11.8%
F47760
 
6.4%
m47760
 
6.4%
M40053
 
5.4%
a40053
 
5.4%
s40053
 
5.4%
c40053
 
5.4%
Other values (2)80106
10.8%

estado_civil
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 686.2 KiB

Union Libre: 33841
Casado: 26868
Soltero: 25684
Viudo: 875
Divorciado: 545

Length

Max length: 11
Median length: 7
Mean length: 8.234225001
Min length: 5

Characters and Unicode

Total characters: 723072
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Union Libre
2nd row: Soltero
3rd row: Soltero
4th row: Union Libre
5th row: Union Libre

Common Values

Value        Count  Frequency (%)
Union Libre  33841  38.5%
Casado       26868  30.6%
Soltero      25684  29.2%
Viudo        875    1.0%
Divorciado   545    0.6%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
union33841
27.8%
libre33841
27.8%
casado26868
22.1%
soltero25684
21.1%
viudo875
 
0.7%
divorciado545
 
0.4%

Most occurring characters

ValueCountFrequency (%)
o                  114042  15.8%
i                  69647   9.6%
n                  67682   9.4%
r                  60070   8.3%
e                  59525   8.2%
a                  54281   7.5%
U                  33841   4.7%
(space)            33841   4.7%
L                  33841   4.7%
b                  33841   4.7%
Other values (11)  162461  22.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter567577
78.5%
Uppercase Letter121654
 
16.8%
Space Separator33841
 
4.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o114042
20.1%
i69647
12.3%
n67682
11.9%
r60070
10.6%
e59525
10.5%
a54281
9.6%
b33841
 
6.0%
d28288
 
5.0%
s26868
 
4.7%
l25684
 
4.5%
Other values (4)27649
 
4.9%
Uppercase Letter
ValueCountFrequency (%)
U33841
27.8%
L33841
27.8%
C26868
22.1%
S25684
21.1%
V875
 
0.7%
D545
 
0.4%
Space Separator
ValueCountFrequency (%)
33841
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin689231
95.3%
Common33841
 
4.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
o114042
16.5%
i69647
10.1%
n67682
9.8%
r60070
8.7%
e59525
8.6%
a54281
7.9%
U33841
 
4.9%
L33841
 
4.9%
b33841
 
4.9%
d28288
 
4.1%
Other values (10)134173
19.5%
Common
ValueCountFrequency (%)
33841
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII723072
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o                  114042  15.8%
i                  69647   9.6%
n                  67682   9.4%
r                  60070   8.3%
e                  59525   8.2%
a                  54281   7.5%
U                  33841   4.7%
(space)            33841   4.7%
L                  33841   4.7%
b                  33841   4.7%
Other values (11)  162461  22.5%

edad
Real number (ℝ≥0)

Distinct: 73
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.22053682
Minimum: 18
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 686.2 KiB

Quantile statistics

Minimum: 18
5-th percentile: 26
Q1: 35
Median: 44
Q3: 53
95-th percentile: 65
Maximum: 98
Range: 80
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 11.91329055
Coefficient of variation (CV): 0.269406285
Kurtosis: -0.3882196952
Mean: 44.22053682
Median Absolute Deviation (MAD): 9
Skewness: 0.3303862305
Sum: 3883138
Variance: 141.9264916
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
44                 3327   3.8%
40                 3055   3.5%
45                 2843   3.2%
41                 2832   3.2%
49                 2753   3.1%
35                 2643   3.0%
46                 2581   2.9%
42                 2531   2.9%
39                 2477   2.8%
31                 2469   2.8%
Other values (63)  60302  68.7%
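
The "Other values" row is the remainder once the ten most frequent ages are subtracted from the 87813 observations. The sketch below checks this, assuming each fused entry in the extracted table splits into a two-digit age followed by its count (for example, 443327 read as age 44 with 3327 rows); that split is an interpretation of the extraction, and the totals below are what support it.

```python
# Assumed split of the fused "value + count" entries for edad.
n_observations = 87813
top_counts = {
    44: 3327, 40: 3055, 45: 2843, 41: 2832, 49: 2753,
    35: 2643, 46: 2581, 42: 2531, 39: 2477, 31: 2469,
}

# Rows not covered by the ten most frequent ages.
other_count = n_observations - sum(top_counts.values())
other_pct = round(100 * other_count / n_observations, 1)

# Share of the single most frequent age.
top_age_pct = round(100 * top_counts[44] / n_observations, 1)

print(other_count, other_pct, top_age_pct)
```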

Value  Count  Frequency (%)
18     7      < 0.1%
19     60     0.1%
20     198    0.2%
21     277    0.3%
22     421    0.5%
23     591    0.7%
24     696    0.8%
25     1043   1.2%
26     1312   1.5%
27     1475   1.7%

Value  Count  Frequency (%)
98     2      < 0.1%
90     3      < 0.1%
89     1      < 0.1%
88     2      < 0.1%
87     1      < 0.1%
85     6      < 0.1%
84     45     0.1%
83     50     0.1%
82     27     < 0.1%
81     41     < 0.1%

municipio_residencia
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 686.2 KiB

ARAUCA: 37992
TAME: 23392
Otros: 13287
ARAUQUITA: 9235
SARAVENA: 3907

Length

Max length: 9
Median length: 6
Mean length: 5.720405862
Min length: 4

Characters and Unicode

Total characters: 502326
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ARAUQUITA
2nd row: ARAUCA
3rd row: ARAUCA
4th row: ARAUCA
5th row: ARAUCA

Common Values

Value      Count  Frequency (%)
ARAUCA     37992  43.3%
TAME       23392  26.6%
Otros      13287  15.1%
ARAUQUITA  9235   10.5%
SARAVENA   3907   4.4%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
arauca37992
43.3%
tame23392
26.6%
otros13287
 
15.1%
arauquita9235
 
10.5%
saravena3907
 
4.4%

Most occurring characters

ValueCountFrequency (%)
A176794
35.2%
U56462
 
11.2%
R51134
 
10.2%
C37992
 
7.6%
T32627
 
6.5%
E27299
 
5.4%
M23392
 
4.7%
O13287
 
2.6%
t13287
 
2.6%
r13287
 
2.6%
Other values (7)56765
 
11.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter449178
89.4%
Lowercase Letter53148
 
10.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A176794
39.4%
U56462
 
12.6%
R51134
 
11.4%
C37992
 
8.5%
T32627
 
7.3%
E27299
 
6.1%
M23392
 
5.2%
O13287
 
3.0%
Q9235
 
2.1%
I9235
 
2.1%
Other values (3)11721
 
2.6%
Lowercase Letter
ValueCountFrequency (%)
t13287
25.0%
r13287
25.0%
o13287
25.0%
s13287
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin502326
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A176794
35.2%
U56462
 
11.2%
R51134
 
10.2%
C37992
 
7.6%
T32627
 
6.5%
E27299
 
5.4%
M23392
 
4.7%
O13287
 
2.6%
t13287
 
2.6%
r13287
 
2.6%
Other values (7)56765
 
11.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII502326
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A176794
35.2%
U56462
 
11.2%
R51134
 
10.2%
C37992
 
7.6%
T32627
 
6.5%
E27299
 
5.4%
M23392
 
4.7%
O13287
 
2.6%
t13287
 
2.6%
r13287
 
2.6%
Other values (7)56765
 
11.3%

municipio_nacimiento
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 686.2 KiB

Otros: 33755
ARAUCA: 28595
TAME: 14455
ARAUQUITA: 6424
SARAVENA: 4584

Length

Max length: 9
Median length: 5
Mean length: 5.61025133
Min length: 4

Characters and Unicode

Total characters: 492653
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ARAUCA
2nd row: ARAUCA
3rd row: ARAUCA
4th row: ARAUCA
5th row: ARAUCA

Common Values

Value      Count  Frequency (%)
Otros      33755  38.4%
ARAUCA     28595  32.6%
TAME       14455  16.5%
ARAUQUITA  6424   7.3%
SARAVENA   4584   5.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
otros33755
38.4%
arauca28595
32.6%
tame14455
16.5%
arauquita6424
 
7.3%
saravena4584
 
5.2%

Most occurring characters

ValueCountFrequency (%)
A133264
27.1%
U41443
 
8.4%
R39603
 
8.0%
O33755
 
6.9%
t33755
 
6.9%
r33755
 
6.9%
o33755
 
6.9%
s33755
 
6.9%
C28595
 
5.8%
T20879
 
4.2%
Other values (7)60094
12.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter357633
72.6%
Lowercase Letter135020
 
27.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A133264
37.3%
U41443
 
11.6%
R39603
 
11.1%
O33755
 
9.4%
C28595
 
8.0%
T20879
 
5.8%
E19039
 
5.3%
M14455
 
4.0%
Q6424
 
1.8%
I6424
 
1.8%
Other values (3)13752
 
3.8%
Lowercase Letter
ValueCountFrequency (%)
t33755
25.0%
r33755
25.0%
o33755
25.0%
s33755
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin492653
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A133264
27.1%
U41443
 
8.4%
R39603
 
8.0%
O33755
 
6.9%
t33755
 
6.9%
r33755
 
6.9%
o33755
 
6.9%
s33755
 
6.9%
C28595
 
5.8%
T20879
 
4.2%
Other values (7)60094
12.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII492653
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A133264
27.1%
U41443
 
8.4%
R39603
 
8.0%
O33755
 
6.9%
t33755
 
6.9%
r33755
 
6.9%
o33755
 
6.9%
s33755
 
6.9%
C28595
 
5.8%
T20879
 
4.2%
Other values (7)60094
12.2%

municipio_expedicion
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 686.2 KiB

ARAUCA: 32964
Otros: 31664
TAME: 14385
ARAUQUITA: 5411
SARAVENA: 3389

Length

Max length: 9
Median length: 5
Mean length: 5.573833032
Min length: 4

Characters and Unicode

Total characters: 489455
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ARAUCA
2nd row: ARAUCA
3rd row: ARAUCA
4th row: ARAUCA
5th row: ARAUCA

Common Values

Value      Count  Frequency (%)
ARAUCA     32964  37.5%
Otros      31664  36.1%
TAME       14385  16.4%
ARAUQUITA  5411   6.2%
SARAVENA   3389   3.9%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
arauca32964
37.5%
otros31664
36.1%
tame14385
16.4%
arauquita5411
 
6.2%
saravena3389
 
3.9%

Most occurring characters

ValueCountFrequency (%)
A139677
28.5%
U43786
 
8.9%
R41764
 
8.5%
C32964
 
6.7%
O31664
 
6.5%
t31664
 
6.5%
r31664
 
6.5%
o31664
 
6.5%
s31664
 
6.5%
T19796
 
4.0%
Other values (7)53148
 
10.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter362799
74.1%
Lowercase Letter126656
 
25.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A139677
38.5%
U43786
 
12.1%
R41764
 
11.5%
C32964
 
9.1%
O31664
 
8.7%
T19796
 
5.5%
E17774
 
4.9%
M14385
 
4.0%
Q5411
 
1.5%
I5411
 
1.5%
Other values (3)10167
 
2.8%
Lowercase Letter
ValueCountFrequency (%)
t31664
25.0%
r31664
25.0%
o31664
25.0%
s31664
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin489455
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A139677
28.5%
U43786
 
8.9%
R41764
 
8.5%
C32964
 
6.7%
O31664
 
6.5%
t31664
 
6.5%
r31664
 
6.5%
o31664
 
6.5%
s31664
 
6.5%
T19796
 
4.0%
Other values (7)53148
 
10.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII489455
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A139677
28.5%
U43786
 
8.9%
R41764
 
8.5%
C32964
 
6.7%
O31664
 
6.5%
t31664
 
6.5%
r31664
 
6.5%
o31664
 
6.5%
s31664
 
6.5%
T19796
 
4.0%
Other values (7)53148
 
10.9%

tipo_persona
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 686.2 KiB

Natural: 87813

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 614691
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Natural
2nd row: Natural
3rd row: Natural
4th row: Natural
5th row: Natural

Common Values

Value    Count  Frequency (%)
Natural  87813  100.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
natural87813
100.0%

Most occurring characters

ValueCountFrequency (%)
a175626
28.6%
N87813
14.3%
t87813
14.3%
u87813
14.3%
r87813
14.3%
l87813
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter526878
85.7%
Uppercase Letter87813
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a175626
33.3%
t87813
16.7%
u87813
16.7%
r87813
16.7%
l87813
16.7%
Uppercase Letter
ValueCountFrequency (%)
N87813
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin614691
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a175626
28.6%
N87813
14.3%
t87813
14.3%
u87813
14.3%
r87813
14.3%
l87813
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII614691
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a175626
28.6%
N87813
14.3%
t87813
14.3%
u87813
14.3%
r87813
14.3%
l87813
14.3%

tiene_casa_propia
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 686.2 KiB

Si: 64886
No: 22927

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 175626
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: Si
4th row: No
5th row: No

Common Values

Value  Count  Frequency (%)
Si     64886  73.9%
No     22927  26.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
si64886
73.9%
no22927
 
26.1%

Most occurring characters

ValueCountFrequency (%)
S64886
36.9%
i64886
36.9%
N22927
 
13.1%
o22927
 
13.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter87813
50.0%
Lowercase Letter87813
50.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S64886
73.9%
N22927
 
26.1%
Lowercase Letter
ValueCountFrequency (%)
i64886
73.9%
o22927
 
26.1%

Most occurring scripts

ValueCountFrequency (%)
Latin175626
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S64886
36.9%
i64886
36.9%
N22927
 
13.1%
o22927
 
13.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII175626
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S64886
36.9%
i64886
36.9%
N22927
 
13.1%
o22927
 
13.1%

sueldo_smdlv
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 514
Distinct (%): 0.6%
Missing: 8704
Missing (%): 9.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 104.3314161
Minimum: 3
Maximum: 600
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 686.2 KiB

Quantile statistics

Minimum: 3
5-th percentile: 27
Q1: 44
Median: 73
Q3: 131
95-th percentile: 290
Maximum: 600
Range: 597
Interquartile range (IQR): 87

Descriptive statistics

Standard deviation: 90.01177771
Coefficient of variation (CV): 0.8627485472
Kurtosis: 6.691830251
Mean: 104.3314161
Median Absolute Deviation (MAD): 35
Skewness: 2.269625533
Sum: 8253554
Variance: 8102.120126
Monotonicity: Not monotonic
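
With 8704 values missing, the reported mean is consistent with averaging over the non-missing rows only. The sketch below shows the arithmetic from the reported figures; that missing values are excluded is an inference from the numbers agreeing, not a statement made by the report.

```python
# Figures copied from the sueldo_smdlv statistics in this report.
n_observations = 87813
n_missing = 8704
total = 8253554  # reported Sum

# Rows that actually carry a value.
n_valid = n_observations - n_missing

# Mean over non-missing rows versus mean over all rows.
mean_valid = total / n_valid
mean_all_rows = total / n_observations

print(n_valid)
print(f"{mean_valid:.7f} vs {mean_all_rows:.7f}")
```

Only the first figure reproduces the reported mean of 104.3314161; dividing by all 87813 rows gives a visibly lower value.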
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
34                  1981   2.3%
36                  1608   1.8%
43                  1561   1.8%
61                  1451   1.7%
72                  1436   1.6%
68                  1406   1.6%
30                  1360   1.5%
76                  1253   1.4%
46                  1244   1.4%
65                  1244   1.4%
Other values (504)  64565  73.5%
(Missing)           8704   9.9%

Value  Count  Frequency (%)
3      13     < 0.1%
4      3      < 0.1%
5      18     < 0.1%
6      28     < 0.1%
7      58     0.1%
8      24     < 0.1%
9      27     < 0.1%
10     146    0.2%
11     42     < 0.1%
12     37     < 0.1%

Value  Count  Frequency (%)
600    277    0.3%
599    2      < 0.1%
597    1      < 0.1%
594    1      < 0.1%
593    1      < 0.1%
588    49     0.1%
586    6      < 0.1%
584    6      < 0.1%
582    15     < 0.1%
576    12     < 0.1%

otros_ingresos_smdlv
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 219
Distinct (%): 3.6%
Missing: 81706
Missing (%): 93.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.82953987
Minimum: 1
Maximum: 300
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 686.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 20
Median: 36
Q3: 68
95-th percentile: 174
Maximum: 300
Range: 299
Interquartile range (IQR): 48

Descriptive statistics

Standard deviation: 56.97399081
Coefficient of variation (CV): 1.039111234
Kurtosis: 6.301325849
Mean: 54.82953987
Median Absolute Deviation (MAD): 19
Skewness: 2.399975784
Sum: 334844
Variance: 3246.035629
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
34                  210    0.2%
36                  186    0.2%
17                  184    0.2%
18                  174    0.2%
23                  148    0.2%
20                  145    0.2%
13                  136    0.2%
21                  136    0.2%
30                  133    0.2%
26                  120    0.1%
Other values (209)  4535   5.2%
(Missing)           81706  93.0%

Value  Count  Frequency (%)
1      1      < 0.1%
2      1      < 0.1%
3      49     0.1%
4      29     < 0.1%
5      17     < 0.1%
6      74     0.1%
7      88     0.1%
8      51     0.1%
9      46     0.1%
10     112    0.1%

Value  Count  Frequency (%)
300    95     0.1%
294    8      < 0.1%
291    10     < 0.1%
289    1      < 0.1%
288    3      < 0.1%
279    2      < 0.1%
276    1      < 0.1%
271    5      < 0.1%
268    3      < 0.1%
264    2      < 0.1%

municipio_credito
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 686.2 KiB

ARAUCA: 39350
TAME: 23880
Otros: 10369
ARAUQUITA: 8765
SARAVENA: 5449

Length

Max length: 9
Median length: 6
Mean length: 5.761584276
Min length: 4

Characters and Unicode

Total characters: 505942
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ARAUQUITA
2nd row: ARAUCA
3rd row: ARAUCA
4th row: ARAUCA
5th row: ARAUCA

Common Values

Value      Count  Frequency (%)
ARAUCA     39350  44.8%
TAME       23880  27.2%
Otros      10369  11.8%
ARAUQUITA  8765   10.0%
SARAVENA   5449   6.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
arauca39350
44.8%
tame23880
27.2%
otros10369
 
11.8%
arauquita8765
 
10.0%
saravena5449
 
6.2%

Most occurring characters

ValueCountFrequency (%)
A184572
36.5%
U56880
 
11.2%
R53564
 
10.6%
C39350
 
7.8%
T32645
 
6.5%
E29329
 
5.8%
M23880
 
4.7%
O10369
 
2.0%
t10369
 
2.0%
r10369
 
2.0%
Other values (7)54615
 
10.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter464466
91.8%
Lowercase Letter41476
 
8.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A184572
39.7%
U56880
 
12.2%
R53564
 
11.5%
C39350
 
8.5%
T32645
 
7.0%
E29329
 
6.3%
M23880
 
5.1%
O10369
 
2.2%
Q8765
 
1.9%
I8765
 
1.9%
Other values (3)16347
 
3.5%
Lowercase Letter
ValueCountFrequency (%)
t10369
25.0%
r10369
25.0%
o10369
25.0%
s10369
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin505942
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A184572
36.5%
U56880
 
11.2%
R53564
 
10.6%
C39350
 
7.8%
T32645
 
6.5%
E29329
 
5.8%
M23880
 
4.7%
O10369
 
2.0%
t10369
 
2.0%
r10369
 
2.0%
Other values (7)54615
 
10.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII505942
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A184572
36.5%
U56880
 
11.2%
R53564
 
10.6%
C39350
 
7.8%
T32645
 
6.5%
E29329
 
5.8%
M23880
 
4.7%
O10369
 
2.0%
t10369
 
2.0%
r10369
 
2.0%
Other values (7)54615
 
10.8%

codeudor
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 686.2 KiB

SIN CODEUDOR: 77099
CON CODEUDOR: 10714

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 1053756
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SIN CODEUDOR
2nd row: SIN CODEUDOR
3rd row: SIN CODEUDOR
4th row: SIN CODEUDOR
5th row: SIN CODEUDOR

Common Values

Value         Count  Frequency (%)
SIN CODEUDOR  77099  87.8%
CON CODEUDOR  10714  12.2%

Length

Histogram of lengths of the category

Pie chart

Value     Count  Frequency (%)
codeudor  87813  50.0%
sin       77099  43.9%
con       10714  6.1%
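
The word frequencies for codeudor are consistent with lowercasing each value and splitting it on whitespace. The sketch below reproduces the counts under that assumption; the tokenisation rule is inferred from the figures rather than taken from the tool's documentation.

```python
from collections import Counter

# Value counts copied from the codeudor summary in this report.
value_counts = {"SIN CODEUDOR": 77099, "CON CODEUDOR": 10714}

# Assumed tokenisation: lowercase, then split on whitespace.
word_counts = Counter()
for value, count in value_counts.items():
    for word in value.lower().split():
        word_counts[word] += count

# Each word's share of all word occurrences, rounded to one decimal.
total_words = sum(word_counts.values())
word_pct = {
    word: round(100 * count / total_words, 1)
    for word, count in word_counts.items()
}

print(dict(word_counts))
print(word_pct)
```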

Most occurring characters

ValueCountFrequency (%)
O        186340  17.7%
D        175626  16.7%
C        98527   9.4%
N        87813   8.3%
(space)  87813   8.3%
E        87813   8.3%
U        87813   8.3%
R        87813   8.3%
S        77099   7.3%
I        77099   7.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter965943
91.7%
Space Separator87813
 
8.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O186340
19.3%
D175626
18.2%
C98527
10.2%
N87813
9.1%
E87813
9.1%
U87813
9.1%
R87813
9.1%
S77099
8.0%
I77099
8.0%
Space Separator
ValueCountFrequency (%)
87813
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin965943
91.7%
Common87813
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
O186340
19.3%
D175626
18.2%
C98527
10.2%
N87813
9.1%
E87813
9.1%
U87813
9.1%
R87813
9.1%
S77099
8.0%
I77099
8.0%
Common
ValueCountFrequency (%)
87813
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1053756
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O        186340  17.7%
D        175626  16.7%
C        98527   9.4%
N        87813   8.3%
(space)  87813   8.3%
E        87813   8.3%
U        87813   8.3%
R        87813   8.3%
S        77099   7.3%
I        77099   7.3%

sector
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 686.2 KiB

PRIVADO: 81059
PUBLICO: 6753

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 614684
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRIVADO
2nd row: PRIVADO
3rd row: PRIVADO
4th row: PRIVADO
5th row: PRIVADO

Common Values

Value      Count  Frequency (%)
PRIVADO    81059  92.3%
PUBLICO    6753   7.7%
(Missing)  1      < 0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
privado81059
92.3%
publico6753
 
7.7%

Most occurring characters

ValueCountFrequency (%)
P87812
14.3%
I87812
14.3%
O87812
14.3%
R81059
13.2%
V81059
13.2%
A81059
13.2%
D81059
13.2%
U6753
 
1.1%
B6753
 
1.1%
L6753
 
1.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter614684
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P87812
14.3%
I87812
14.3%
O87812
14.3%
R81059
13.2%
V81059
13.2%
A81059
13.2%
D81059
13.2%
U6753
 
1.1%
B6753
 
1.1%
L6753
 
1.1%

Most occurring scripts

ValueCountFrequency (%)
Latin614684
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P87812
14.3%
I87812
14.3%
O87812
14.3%
R81059
13.2%
V81059
13.2%
A81059
13.2%
D81059
13.2%
U6753
 
1.1%
B6753
 
1.1%
L6753
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII614684
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P87812
14.3%
I87812
14.3%
O87812
14.3%
R81059
13.2%
V81059
13.2%
A81059
13.2%
D81059
13.2%
U6753
 
1.1%
B6753
 
1.1%
L6753
 
1.1%

año_credito
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 26
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2014.531675
Minimum: 1993
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 686.2 KiB

Quantile statistics

Minimum: 1993
5-th percentile: 2004
Q1: 2012
Median: 2016
Q3: 2019
95-th percentile: 2020
Maximum: 2021
Range: 28
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 5.237870163
Coefficient of variation (CV): 0.002600043587
Kurtosis: 0.5772577467
Mean: 2014.531675
Median Absolute Deviation (MAD): 3
Skewness: -1.063953099
Sum: 176902070
Variance: 27.43528385
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
Value              Count  Frequency (%)
2020               12328  14.0%
2019               11620  13.2%
2018               7294   8.3%
2015               6919   7.9%
2017               6327   7.2%
2016               5829   6.6%
2014               5380   6.1%
2013               4800   5.5%
2012               4329   4.9%
2011               3749   4.3%
Other values (16)  19238  21.9%

Value  Count  Frequency (%)
1993   1      < 0.1%
1997   202    0.2%
1998   399    0.5%
1999   505    0.6%
2000   587    0.7%
2001   856    1.0%
2002   855    1.0%
2003   833    0.9%
2004   1034   1.2%
2005   1442   1.6%

Value  Count  Frequency (%)
2021   1864   2.1%
2020   12328  14.0%
2019   11620  13.2%
2018   7294   8.3%
2017   6327   7.2%
2016   5829   6.6%
2015   6919   7.9%
2014   5380   6.1%
2013   4800   5.5%
2012   4329   4.9%

mes_credito
Real number (ℝ≥0)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.840422261
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 686.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.482146677
Coefficient of variation (CV): 0.5090543455
Kurtosis: -1.218566074
Mean: 6.840422261
Median Absolute Deviation (MAD): 3
Skewness: -0.1123632442
Sum: 600678
Variance: 12.12534548
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value             Count  Frequency (%)
12                9286   10.6%
10                8504   9.7%
11                7757   8.8%
9                 7586   8.6%
7                 7412   8.4%
6                 7244   8.2%
5                 7187   8.2%
8                 6819   7.8%
3                 6792   7.7%
2                 6631   7.6%
Other values (2)  12595  14.3%

Value  Count  Frequency (%)
1      6416   7.3%
2      6631   7.6%
3      6792   7.7%
4      6179   7.0%
5      7187   8.2%
6      7244   8.2%
7      7412   8.4%
8      6819   7.8%
9      7586   8.6%
10     8504   9.7%

Value  Count  Frequency (%)
12     9286   10.6%
11     7757   8.8%
10     8504   9.7%
9      7586   8.6%
8      6819   7.8%
7      7412   8.4%
6      7244   8.2%
5      7187   8.2%
4      6179   7.0%
3      6792   7.7%

valor_credito_smdlv
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 579
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 58.42896838
Minimum: 0
Maximum: 700
Zeros: 785
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 686.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 16
Median: 40
Q3: 71
95-th percentile: 201
Maximum: 700
Range: 700
Interquartile range (IQR): 55

Descriptive statistics

Standard deviation: 69.38218407
Coefficient of variation (CV): 1.187462076
Kurtosis: 14.3136109
Mean: 58.42896838
Median Absolute Deviation (MAD): 26
Skewness: 3.09376473
Sum: 5130823
Variance: 4813.887466
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
6                   2955   3.4%
7                   1947   2.2%
5                   1842   2.1%
1                   1751   2.0%
2                   1671   1.9%
15                  1390   1.6%
18                  1282   1.5%
16                  1247   1.4%
17                  1237   1.4%
8                   1233   1.4%
Other values (569)  71258  81.1%

Value  Count  Frequency (%)
0      785    0.9%
1      1751   2.0%
2      1671   1.9%
3      1042   1.2%
4      1198   1.4%
5      1842   2.1%
6      2955   3.4%
7      1947   2.2%
8      1233   1.4%
9      969    1.1%

Value  Count  Frequency (%)
700    53     0.1%
697    1      < 0.1%
695    1      < 0.1%
691    1      < 0.1%
690    1      < 0.1%
687    1      < 0.1%
682    1      < 0.1%
677    1      < 0.1%
676    1      < 0.1%
675    1      < 0.1%

estado_final
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 686.2 KiB

Value               Count
PAGADO VENCIDO      38150
PAGADO ANTICIPADO   19882
CONTADO             17826
PAGADO A TIEMPO     5705
DESCUENTO EN VENTA  5231
Other values (3)    1019

Length

Max length: 18
Median length: 14
Mean length: 13.54846093
Min length: 7

Characters and Unicode

Total characters: 1189731
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CONTADO
2nd row: CONTADO
3rd row: CONTADO
4th row: CONTADO
5th row: CONTADO

Common Values

Value               Count  Frequency (%)
PAGADO VENCIDO      38150  43.4%
PAGADO ANTICIPADO   19882  22.6%
CONTADO             17826  20.3%
PAGADO A TIEMPO     5705   6.5%
DESCUENTO EN VENTA  5231   6.0%
DEVOLUCION          515    0.6%
CARTERA CASTIGADA   355    0.4%
OTROS CIERRES       149    0.2%

Length

Histogram of lengths of the category

Pie chart

Value             Count  Frequency (%)
pagado            63737  37.9%
vencido           38150  22.7%
anticipado        19882  11.8%
contado           17826  10.6%
a                 5705   3.4%
tiempo            5705   3.4%
venta             5231   3.1%
descuento         5231   3.1%
en                5231   3.1%
devolucion        515    0.3%
Other values (4)  1008   0.6%

Most occurring characters

Value             Count   Frequency (%)
A                 197775  16.6%
O                 169685  14.3%
D                 145696  12.2%
N                 92066   7.7%
P                 89324   7.5%
I                 84638   7.1%
C                 82463   6.9%
(space)           80408   6.8%
E                 65947   5.5%
G                 64092   5.4%
Other values (7)  117637  9.9%

Most occurring categories

Value             Count    Frequency (%)
Uppercase Letter  1109323  93.2%
Space Separator   80408    6.8%

Most frequent character per category

Uppercase Letter
Value             Count   Frequency (%)
A                 197775  17.8%
O                 169685  15.3%
D                 145696  13.1%
N                 92066   8.3%
P                 89324   8.1%
I                 84638   7.6%
C                 82463   7.4%
E                 65947   5.9%
G                 64092   5.8%
T                 54734   4.9%
Other values (6)  62903   5.7%
Space Separator
Value    Count  Frequency (%)
(space)  80408  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   1109323  93.2%
Common  80408    6.8%

Most frequent character per script

Latin
Value             Count   Frequency (%)
A                 197775  17.8%
O                 169685  15.3%
D                 145696  13.1%
N                 92066   8.3%
P                 89324   8.1%
I                 84638   7.6%
C                 82463   7.4%
E                 65947   5.9%
G                 64092   5.8%
T                 54734   4.9%
Other values (6)  62903   5.7%
Common
Value    Count  Frequency (%)
(space)  80408  100.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1189731  100.0%

Most frequent character per block

ASCII
Value             Count   Frequency (%)
A                 197775  16.6%
O                 169685  14.3%
D                 145696  12.2%
N                 92066   7.7%
P                 89324   7.5%
I                 84638   7.1%
C                 82463   6.9%
(space)           80408   6.8%
E                 65947   5.5%
G                 64092   5.4%
Other values (7)  117637  9.9%

tipo_venta
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 686.2 KiB

Value              Count
ELECTRODOMESTICOS  87051
MOTOS              751
CONTRATO           11

Length

Max length: 17
Median length: 17
Mean length: 16.89624543
Min length: 5

Characters and Unicode

Total characters: 1483710
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ELECTRODOMESTICOS
2nd row: ELECTRODOMESTICOS
3rd row: ELECTRODOMESTICOS
4th row: ELECTRODOMESTICOS
5th row: ELECTRODOMESTICOS

Common Values

Value              Count  Frequency (%)
ELECTRODOMESTICOS  87051  99.1%
MOTOS              751    0.9%
CONTRATO           11     < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value              Count  Frequency (%)
electrodomesticos  87051  99.1%
motos              751    0.9%
contrato           11     < 0.1%

Most occurring characters

Value             Count   Frequency (%)
O                 262677  17.7%
E                 261153  17.6%
T                 174875  11.8%
S                 174853  11.8%
C                 174113  11.7%
M                 87802   5.9%
R                 87062   5.9%
L                 87051   5.9%
D                 87051   5.9%
I                 87051   5.9%
Other values (2)  22      < 0.1%

Most occurring categories

Value             Count    Frequency (%)
Uppercase Letter  1483710  100.0%

Most frequent character per category

Uppercase Letter
Value             Count   Frequency (%)
O                 262677  17.7%
E                 261153  17.6%
T                 174875  11.8%
S                 174853  11.8%
C                 174113  11.7%
M                 87802   5.9%
R                 87062   5.9%
L                 87051   5.9%
D                 87051   5.9%
I                 87051   5.9%
Other values (2)  22      < 0.1%

Most occurring scripts

Value  Count    Frequency (%)
Latin  1483710  100.0%

Most frequent character per script

Latin
Value             Count   Frequency (%)
O                 262677  17.7%
E                 261153  17.6%
T                 174875  11.8%
S                 174853  11.8%
C                 174113  11.7%
M                 87802   5.9%
R                 87062   5.9%
L                 87051   5.9%
D                 87051   5.9%
I                 87051   5.9%
Other values (2)  22      < 0.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1483710  100.0%

Most frequent character per block

ASCII
Value             Count   Frequency (%)
O                 262677  17.7%
E                 261153  17.6%
T                 174875  11.8%
S                 174853  11.8%
C                 174113  11.7%
M                 87802   5.9%
R                 87062   5.9%
L                 87051   5.9%
D                 87051   5.9%
I                 87051   5.9%
Other values (2)  22      < 0.1%

periodo_credito
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 686.2 KiB

Value          Count
MENSUAL(ES)    71579
DIARIA(S)      15224
SEMANAL(ES)    766
QUINCENAL(ES)  244

Length

Max length: 13
Median length: 11
Mean length: 10.65882045
Min length: 9

Characters and Unicode

Total characters: 935983
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: DIARIA(S)
2nd row: DIARIA(S)
3rd row: DIARIA(S)
4th row: DIARIA(S)
5th row: DIARIA(S)

Common Values

Value          Count  Frequency (%)
MENSUAL(ES)    71579  81.5%
DIARIA(S)      15224  17.3%
SEMANAL(ES)    766    0.9%
QUINCENAL(ES)  244    0.3%

Length

Histogram of lengths of the category

Pie chart

Value         Count  Frequency (%)
mensual(es    71579  81.5%
diaria(s      15224  17.3%
semanal(es    766    0.9%
quincenal(es  244    0.3%

Most occurring characters

Value             Count   Frequency (%)
S                 160158  17.1%
E                 145178  15.5%
A                 103803  11.1%
(                 87813   9.4%
)                 87813   9.4%
N                 72833   7.8%
L                 72589   7.8%
M                 72345   7.7%
U                 71823   7.7%
I                 30692   3.3%
Other values (4)  30936   3.3%

Most occurring categories

Value              Count   Frequency (%)
Uppercase Letter   760357  81.2%
Open Punctuation   87813   9.4%
Close Punctuation  87813   9.4%

Most frequent character per category

Uppercase Letter
Value             Count   Frequency (%)
S                 160158  21.1%
E                 145178  19.1%
A                 103803  13.7%
N                 72833   9.6%
L                 72589   9.5%
M                 72345   9.5%
U                 71823   9.4%
I                 30692   4.0%
D                 15224   2.0%
R                 15224   2.0%
Other values (2)  488     0.1%
Open Punctuation
Value  Count  Frequency (%)
(      87813  100.0%
Close Punctuation
Value  Count  Frequency (%)
)      87813  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   760357  81.2%
Common  175626  18.8%

Most frequent character per script

Latin
Value             Count   Frequency (%)
S                 160158  21.1%
E                 145178  19.1%
A                 103803  13.7%
N                 72833   9.6%
L                 72589   9.5%
M                 72345   9.5%
U                 71823   9.4%
I                 30692   4.0%
D                 15224   2.0%
R                 15224   2.0%
Other values (2)  488     0.1%
Common
Value  Count  Frequency (%)
(      87813  50.0%
)      87813  50.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  935983  100.0%

Most frequent character per block

ASCII
Value             Count   Frequency (%)
S                 160158  17.1%
E                 145178  15.5%
A                 103803  11.1%
(                 87813   9.4%
)                 87813   9.4%
N                 72833   7.8%
L                 72589   7.8%
M                 72345   7.7%
U                 71823   7.7%
I                 30692   3.3%
Other values (4)  30936   3.3%

cuotas
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.93397333
Minimum: 0
Maximum: 14
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 686.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 3
Q3: 10
95-th percentile: 14
Maximum: 14
Range: 14
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 4.563636673
Coefficient of variation (CV): 0.924941496
Kurtosis: -0.8753871904
Mean: 4.93397333
Median Absolute Deviation (MAD): 2
Skewness: 0.7763494299
Sum: 433267
Variance: 20.82677968
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
Value             Count  Frequency (%)
1                 38167  43.5%
10                8802   10.0%
5                 8462   9.6%
14                7699   8.8%
3                 4127   4.7%
2                 4078   4.6%
12                3777   4.3%
6                 3605   4.1%
4                 2860   3.3%
9                 2333   2.7%
Other values (5)  3903   4.4%

Value  Count  Frequency (%)
0      1      < 0.1%
1      38167  43.5%
2      4078   4.6%
3      4127   4.7%
4      2860   3.3%
5      8462   9.6%
6      3605   4.1%
7      641    0.7%
8      1270   1.4%
9      2333   2.7%

Value  Count  Frequency (%)
14     7699   8.8%
13     254    0.3%
12     3777   4.3%
11     1737   2.0%
10     8802   10.0%
9      2333   2.7%
8      1270   1.4%
7      641    0.7%
6      3605   4.1%
5      8462   9.6%

forma_pago
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 686.2 KiB

Value     Count
CRÉDITO   67555
CONTADO   17826
LIBRANZA  2432

Length

Max length: 8
Median length: 7
Mean length: 7.027695216
Min length: 7

Characters and Unicode

Total characters: 617123
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CONTADO
2nd row: CONTADO
3rd row: CONTADO
4th row: CONTADO
5th row: CONTADO

Common Values

Value     Count  Frequency (%)
CRÉDITO   67555  76.9%
CONTADO   17826  20.3%
LIBRANZA  2432   2.8%

Length

Histogram of lengths of the category

Pie chart

Value     Count  Frequency (%)
crédito   67555  76.9%
contado   17826  20.3%
libranza  2432   2.8%

Most occurring characters

Value             Count   Frequency (%)
O                 103207  16.7%
C                 85381   13.8%
T                 85381   13.8%
D                 85381   13.8%
R                 69987   11.3%
I                 69987   11.3%
É                 67555   10.9%
A                 22690   3.7%
N                 20258   3.3%
L                 2432    0.4%
Other values (2)  4864    0.8%

Most occurring categories

Value             Count   Frequency (%)
Uppercase Letter  617123  100.0%

Most frequent character per category

Uppercase Letter
Value             Count   Frequency (%)
O                 103207  16.7%
C                 85381   13.8%
T                 85381   13.8%
D                 85381   13.8%
R                 69987   11.3%
I                 69987   11.3%
É                 67555   10.9%
A                 22690   3.7%
N                 20258   3.3%
L                 2432    0.4%
Other values (2)  4864    0.8%

Most occurring scripts

Value  Count   Frequency (%)
Latin  617123  100.0%

Most frequent character per script

Latin
Value             Count   Frequency (%)
O                 103207  16.7%
C                 85381   13.8%
T                 85381   13.8%
D                 85381   13.8%
R                 69987   11.3%
I                 69987   11.3%
É                 67555   10.9%
A                 22690   3.7%
N                 20258   3.3%
L                 2432    0.4%
Other values (2)  4864    0.8%

Most occurring blocks

Value        Count   Frequency (%)
ASCII        549568  89.1%
Latin 1 Sup  67555   10.9%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
O      103207  18.8%
C      85381   15.5%
T      85381   15.5%
D      85381   15.5%
R      69987   12.7%
I      69987   12.7%
A      22690   4.1%
N      20258   3.7%
L      2432    0.4%
B      2432    0.4%
Latin 1 Sup
Value  Count  Frequency (%)
É      67555  100.0%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
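
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the report generator's own code: the function name `pearson_r` and the sample values are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors in the covariance and in the standard deviations cancel,
    # so plain sums of products are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical values, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r = pearson_r(x, y)
# Shifting and rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([10 * v + 3 for v in x], y)
print(r, r_rescaled)
```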

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
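
As a rough sketch of that recipe, the Python below first converts each variable to ranks and then applies the covariance-over-standard-deviations formula to the ranks. The helper names are hypothetical, and giving tied values their average rank is a common convention assumed here, not something stated in this report.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("rho is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# A nonlinear but strictly increasing relationship (hypothetical data).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering matters, the cubic relationship above is treated as a perfect monotonic association even though it is far from a straight line.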

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
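
The pair-counting definition can be written down directly. The sketch below implements exactly the formula in the text (often called tau-a); the function name is hypothetical, and statistical libraries frequently report a tie-adjusted variant instead, so results may differ from the report when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
        # Pairs tied in either variable count as neither.
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: two concordant pairs and one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```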

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
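
For readers who want to see the correction spelled out, here is a minimal sketch of a bias-corrected Cramér's V computed from a contingency table. It follows the commonly cited form of Bergsma's correction; the function name and the list-of-rows input format are assumptions for this example, and it is not taken from the report generator's source.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, k = len(row_totals), len(col_totals)
    if n < 2 or r < 2 or k < 2:
        raise ValueError("need at least a 2x2 table with 2 or more observations")

    # Pearson chi-squared statistic: observed counts against independence.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # The correction shrinks phi^2 and the table dimensions by terms of order 1/(n-1).
    phi2_corr = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(r_corr - 1, k_corr - 1)
    if denominator <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denominator)


# Hypothetical 2x2 tables: no association, then perfect association.
v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
print(v_independent, v_perfect)
```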

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram lets you correlate variable completion more fully, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
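
To make the idea of nullity correlation concrete, the sketch below computes the correlation between the "is present" indicators of two columns, which for binary indicators reduces to the phi coefficient of a 2x2 table. The function name is hypothetical, missing values are represented by `None` for simplicity (the report's data shows them as NaN), and returning `None` for columns that are entirely present or entirely missing is a choice made for this example.

```python
import math


def nullity_correlation(col_a, col_b):
    """Correlation between the presence indicators of two equally long columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same length")
    both = only_a = only_b = neither = 0
    for a, b in zip(col_a, col_b):
        a_present = a is not None
        b_present = b is not None
        if a_present and b_present:
            both += 1
        elif a_present:
            only_a += 1
        elif b_present:
            only_b += 1
        else:
            neither += 1
    # Product of the four marginal totals of the 2x2 presence table.
    marginals = (both + only_a) * (only_b + neither) * (both + only_b) * (only_a + neither)
    if marginals == 0:
        return None  # a column with no variation in presence has no correlation
    return (both * neither - only_a * only_b) / math.sqrt(marginals)


# Hypothetical columns: values go missing together, then in opposition.
together = nullity_correlation([1, None, 3, None], [5, None, 7, None])
opposed = nullity_correlation([1, None, 3, None], [None, 6, None, 8])
print(together, opposed)
```

A value near +1 means the two variables tend to be missing in the same rows, and a value near -1 means that when one is present the other tends to be missing.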

Sample

First rows

Rowprocedenciageneroestado_civiledadmunicipio_residenciamunicipio_nacimientomunicipio_expediciontipo_personatiene_casa_propiasueldo_smdlvotros_ingresos_smdlvmunicipio_creditocodeudorsectoraño_creditomes_creditovalor_credito_smdlvestado_finaltipo_ventaperiodo_creditocuotasforma_pago
011678NacionalFemeninoUnion Libre28ARAUQUITAARAUCAARAUCANaturalNo68.0NaNARAUQUITASIN CODEUDORPRIVADO2020678CONTADOELECTRODOMESTICOSDIARIA(S)1CONTADO
111710ExtranjeroFemeninoSoltero31ARAUCAARAUCAARAUCANaturalNo32.0NaNARAUCASIN CODEUDORPRIVADO202132CONTADOELECTRODOMESTICOSDIARIA(S)1CONTADO
211742NacionalFemeninoSoltero46ARAUCAARAUCAARAUCANaturalSi47.0NaNARAUCASIN CODEUDORPRIVADO2020115CONTADOELECTRODOMESTICOSDIARIA(S)1CONTADO
311774NacionalFemeninoUnion Libre24ARAUCAARAUCAARAUCANaturalNo68.0NaNARAUCASIN CODEUDORPRIVADO2020116CONTADOELECTRODOMESTICOSDIARIA(S)1CONTADO
411806NacionalFemeninoUnion Libre24ARAUCAARAUCAARAUCANaturalNo273.030.0ARAUCASIN CODEUDORPRIVADO202066CONTADOELECTRODOMESTICOSDIARIA(S)1CONTADO
511838NacionalFemeninoUnion Libre26TAMEARAUCAARAUCANaturalNo36.0NaNTAMESIN CODEUDORPRIVADO2019747PAGADO VENCIDOELECTRODOMESTICOSDIARIA(S)14CRÉDITO
611902NacionalFemeninoCasado35ARAUCAARAUCAARAUCANaturalSi215.0NaNARAUCASIN CODEUDORPRIVADO201985CONTADOELECTRODOMESTICOSDIARIA(S)1CONTADO
711934NacionalFemeninoCasado48ARAUCAARAUCAARAUCANaturalNoNaNNaNARAUCASIN CODEUDORPRIVADO20206270CONTADOELECTRODOMESTICOSDIARIA(S)1CONTADO
811966NacionalFemeninoSoltero34ARAUCAARAUCAARAUCANaturalNo102.034.0ARAUCASIN CODEUDORPRIVADO2020126CONTADOELECTRODOMESTICOSDIARIA(S)1CONTADO
911998NacionalFemeninoViudo49ARAUCAARAUCAARAUCANaturalSi72.036.0ARAUCASIN CODEUDORPRIVADO2019475PAGADO VENCIDOELECTRODOMESTICOSDIARIA(S)14CRÉDITO

Last rows

Rowprocedenciageneroestado_civiledadmunicipio_residenciamunicipio_nacimientomunicipio_expediciontipo_personatiene_casa_propiasueldo_smdlvotros_ingresos_smdlvmunicipio_creditocodeudorsectoraño_creditomes_creditovalor_credito_smdlvestado_finaltipo_ventaperiodo_creditocuotasforma_pago
87803126761NacionalMasculinoDivorciado46TAMETAMETAMENaturalSi152.0NaNTAMESIN CODEUDORPRIVADO2013347PAGADO ANTICIPADOELECTRODOMESTICOSDIARIA(S)1CRÉDITO
87804126825NacionalMasculinoUnion Libre53ARAUQUITAARAUQUITAARAUQUITANaturalSi85.0NaNARAUQUITASIN CODEUDORPRIVADO20201046CONTADOELECTRODOMESTICOSDIARIA(S)1CONTADO
87805126527NacionalMasculinoCasado42OtrosOtrosOtrosNaturalSi54.0NaNOtrosSIN CODEUDORPRIVADO2019936CONTADOELECTRODOMESTICOSDIARIA(S)1CONTADO
87806126623NacionalMasculinoUnion Libre27ARAUCAARAUQUITAARAUQUITANaturalNo68.0NaNARAUCASIN CODEUDORPRIVADO2020983CONTADOELECTRODOMESTICOSDIARIA(S)1CONTADO
87807126655NacionalMasculinoUnion Libre26TAMETAMETAMENaturalSi6.0NaNTAMESIN CODEUDORPRIVADO201532PAGADO ANTICIPADOELECTRODOMESTICOSDIARIA(S)1CRÉDITO
87808126687NacionalMasculinoCasado28ARAUQUITAARAUQUITAARAUQUITANaturalSi31.0NaNARAUQUITASIN CODEUDORPRIVADO202091CONTADOELECTRODOMESTICOSDIARIA(S)1CONTADO
87809126719NacionalMasculinoSoltero42ARAUCATAMETAMENaturalSi262.0NaNARAUCASIN CODEUDORPRIVADO2010766CONTADOELECTRODOMESTICOSDIARIA(S)1CONTADO
87810126751NacionalMasculinoUnion Libre37ARAUQUITAARAUQUITAARAUQUITANaturalSiNaNNaNARAUQUITASIN CODEUDORPRIVADO20201014CONTADOELECTRODOMESTICOSDIARIA(S)1CONTADO
87811126783NacionalMasculinoCasado60ARAUQUITAARAUQUITAARAUQUITANaturalSi36.0NaNARAUQUITASIN CODEUDORPRIVADO201995CONTADOELECTRODOMESTICOSDIARIA(S)1CONTADO
87812126815NacionalMasculinoUnion Libre48TAMETAMETAMENaturalSi97.0NaNTAMESIN CODEUDORPRIVADO20142107PAGADO VENCIDOELECTRODOMESTICOSDIARIA(S)14CRÉDITO